Multi-Dimensional Database Allocation for Parallel Data Warehouses

Authors

  • Thomas Stöhr
  • Holger Märtens
  • Erhard Rahm
Abstract

Data allocation is a key performance factor for parallel database systems (PDBS). This holds especially for data warehousing environments where huge amounts of data and complex analytical queries have to be dealt with. While there are several studies on data allocation for relational PDBS, the specific requirements of data warehouses have not yet been sufficiently addressed. In this study, we consider the allocation of relational data warehouses based on a star schema and utilizing bitmap index structures. We investigate how a multi-dimensional hierarchical data fragmentation of the fact table supports queries referencing different subsets of the schema dimensions. Our analysis is based on realistic parameters derived from a decision support benchmark. The performance implications of different allocation choices are evaluated by means of a detailed simulation model.

Similar resources

WARLOCK: A Data Allocation Tool for Parallel Warehouses

We present the WARLOCK tool to automatically determine a parallel data warehouse’s allocation to disk. This GUI-equipped tool is implemented in Java and utilizes an internal cost model and heuristics to determine a disk allocation minimizing both I/O work and query response times. WARLOCK recommends a ranked list of fragmentation candidates, a detailed query performance analysis and a tailored p...

A Parallel Scalable Infrastructure for OLAP and Data Mining

Decision support systems are important in leveraging information present in data warehouses in businesses like banking, insurance, retail and health-care among many others. The multi-dimensional aspects of a business can be naturally expressed using a multi-dimensional data model. Data analysis and data mining on these warehouses pose new challenges for traditional database systems. OLAP and da...

Online Data Mining

INTRODUCTION Currently, most data warehouses are being used for summarization-based, multi-dimensional, online analytical processing (OLAP). However, given the recent developments in data warehouse and online analytical processing technology, together with the rapid progress in data mining research, industry analysts anticipate that organizations will soon be using their data warehouses for soph...

Fuzzy multi-criteria selection procedures in choosing data source

Technology assessment and selection has a substantial impact on organizations' procedures with regard to technology transfer. Technological decisions are usually made by a group of experts, and integrating these viewpoints into a single decision can be quite complex. Today, operational databases and data warehouses exist to manage and organize data with specific features and henceforth, th...

Caching for Multi-dimensional Data Mining Queries

Multi-dimensional data analysis and online analytical processing are standard querying techniques applied on today’s data warehouses. Data mining algorithms, on the other hand, are still mostly run in stand-alone, batch mode on flat files extracted from relational databases. In this paper we propose a general querying model combining the power of relational databases, SQL, multidimensional quer...

Publication date: 2000